Goto

Collaborating Authors

 vt 1


Active Learning of Classifiers with Label and Seed Queries

Neural Information Processing Systems

We study exact active learning of binary and multiclass classifiers with margin. Given an n-point set X Rm, we want to learn an unknown classifier on X whose classes have finite strong convex hull margin, a new notion extending the SVM margin.



Appendices Contents Appendices 18

Neural Information Processing Systems

Diplomacyisacomplex environment, where training requires significant time. The action is an allocation of the player's coins across the fields: the player decides how manyof itsccoins to put in each of the fields, choosing c1,c2,...,cf where Pf Finally, Blotto is a single-turn (i.e.




Prior-independentDynamicAuctionsfora Value-maximizing Buyer

Neural Information Processing Systems

Automatic bidding has become one of the main options for advertisers to buy advertisement opportunities intheonline advertising market[Dolan, 2020]. Theprevalence ofautomatic bidding is partly driven by the fact that it significantly simplifies the interaction between the advertisers and theadvertisingplatform.


SupplementaryMaterials AProofofTheorem2: AsymptoticConvergenceofRobustQ-Learning

Neural Information Processing Systems

From[BorkarandMeyn,2000],weknowthatthestochastic approximation (18) converges to the fixed point ofT, i.e., Q . Finally, to show Theorem 3, we only need to show each term in(56) is smaller than . In this section we develop the finite-time analysis of the robust TDC algorithm. We note that recently there are several works [Srikant and Ying, 2019, Xu and Liang, 2021, Kaledin et al., 2020] on finite-time analysis of RL algorithms that do not need theprojection. Specifically, the problem in [Srikant and Ying, 2019] is for one time scalelinear stochastic approximation.


FullyUnconstrainedOnlineLearning

Neural Information Processing Systems

We provide a technique for online convex optimization that obtains regret G w Tlog( w G T)+ w 2 +G2 on G-Lipschitz losses for any comparison pointw without knowing eitherG or w .


Adaptive Federated Optimization

arXiv.org Machine Learning

Federated learning is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data. Due to the heterogeneity of the client datasets, standard federated optimization methods such as Federated Averaging (FedAvg) are often difficult to tune and exhibit unfavorable convergence behavior. In non-federated settings, adaptive optimization methods have had notable success in combating such issues. In this work, we propose federated versions of adaptive optimizers, including Adagrad, Adam, and Yogi, and analyze their convergence in the presence of heterogeneous data for general nonconvex settings. Our results highlight the interplay between client heterogeneity and communication efficiency. We also perform extensive experiments on these methods and show that the use of adaptive optimizers can significantly improve the performance of federated learning.


Nonmyopic Gaussian Process Optimization with Macro-Actions

arXiv.org Machine Learning

This paper presents a multi-staged approach to nonmyopic adaptive Gaussian process optimization (GPO) for Bayesian optimization (BO) of unknown, highly complex objective functions that, in contrast to existing nonmyopic adaptive BO algorithms, exploits the notion of macro-actions for scaling up to a further lookahead to match up to a larger available budget. To achieve this, we generalize GP upper confidence bound to a new acquisition function defined w.r.t. a nonmyopic adaptive macro-action policy, which is intractable to be optimized exactly due to an uncountable set of candidate outputs. The contribution of our work here is thus to derive a nonmyopic adaptive epsilon-Bayes-optimal macro-action GPO (epsilon-Macro-GPO) policy. To perform nonmyopic adaptive BO in real time, we then propose an asymptotically optimal anytime variant of our epsilon-Macro-GPO policy with a performance guarantee. We empirically evaluate the performance of our epsilon-Macro-GPO policy and its anytime variant in BO with synthetic and real-world datasets.